Speaker Personality Classification Using Systems Based on Acoustic-Lexical Cues and an Optimal Tree-Structured Bayesian Network

Authors

  • Kartik Audhkhasi
  • Angeliki Metallinou
  • Ming Li
  • Shrikanth S. Narayanan
Abstract

Automatic classification of human personality along the Big Five dimensions is an interesting problem with several practical applications. This paper makes several contributions to this problem. First, we propose a few automatically derived, personality-discriminating lexical features that provide information complementary to the conventional acoustic-prosodic cues. We also design a frame-level Gaussian mixture model based system that adds complementary information to the systems trained on global statistical functionals. Next, we note that the Big Five dimensions are correlated, and we therefore model the dependency between these dimensions in the form of an optimal tree-structured Bayesian network. Our final sub-system consists of within-class covariance normalization followed by L1-regularized logistic regression. Fusion of all these sub-systems achieves better classification performance than independently trained classifiers using just acoustic features.
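
The abstract does not say how the optimal tree over the Big Five dimensions is found. One standard way to obtain a maximum-likelihood tree-structured Bayesian network is the Chow-Liu procedure: estimate the mutual information between every pair of variables, then take the maximum-weight spanning tree. The sketch below illustrates that procedure under the assumption that each trait is a discrete (for example, binary high/low) label. The function names `mutual_information` and `chow_liu_tree` are hypothetical and not taken from the paper.

```python
import numpy as np


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete label arrays."""
    x = np.asarray(x)
    y = np.asarray(y)
    mi = 0.0
    for a in np.unique(x):
        p_a = np.mean(x == a)
        for b in np.unique(y):
            p_b = np.mean(y == b)
            p_ab = np.mean((x == a) & (y == b))
            if p_ab > 0:  # skip empty cells; they contribute 0 in the limit
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi


def chow_liu_tree(labels):
    """Return the edges of the maximum-mutual-information spanning tree.

    labels: array of shape (n_samples, n_vars) with discrete values,
            one column per variable (assumed here: one per personality trait).
    The tree is grown with Prim's algorithm starting from variable 0.
    """
    labels = np.asarray(labels)
    n_vars = labels.shape[1]

    # Symmetric matrix of pairwise mutual information.
    mi = np.zeros((n_vars, n_vars))
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            mi[i, j] = mi[j, i] = mutual_information(labels[:, i], labels[:, j])

    in_tree = [0]
    edges = []
    while len(in_tree) < n_vars:
        best = None
        for u in in_tree:
            for v in range(n_vars):
                if v in in_tree:
                    continue
                if best is None or mi[u, v] > best[2]:
                    best = (u, v, mi[u, v])
        edges.append((best[0], best[1]))  # edge from a tree node to a new node
        in_tree.append(best[1])
    return edges
```

Calling `chow_liu_tree(labels)` on an `(n_samples, 5)` array of trait labels returns four `(parent, child)` index pairs. Directing the edges away from variable 0 gives one valid Bayesian network orientation; any other root yields an equivalent tree distribution.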

Similar resources

OPTIMIZATION OF TREE-STRUCTURED GAS DISTRIBUTION NETWORK USING ANT COLONY OPTIMIZATION: A CASE STUDY

An Ant Colony Optimization (ACO) algorithm is proposed for an optimal tree-structured natural gas distribution network. Designing pipelines, facilities, and equipment systems is a necessary task in configuring an optimal natural gas network. A mixed integer programming model is formulated to minimize the total cost in the network. The aim is to optimize pipe diameter sizes so that the location-alloc...

Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features

This paper proposes to integrate multi-modal features using conditional random fields (CRF) for broadcast news story segmentation. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness, acoustic features involve pause duration, pitch, speaker change and audio event type, and visual fea...

Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles

Herein we present a comparison of novel concepts for a robust fusion of prosodic and verbal cues in speech emotion recognition. Thereby 276 acoustic features are extracted out of a spoken phrase. For linguistic content analysis we use the Bag-of-Words text representation. This allows for integration of acoustic and linguistic features within one vector prior to a final classification. Extensive...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Laughter Valence Prediction in Motivational Interviewing Based on Lexical and Acoustic Cues

Motivational Interviewing (MI) is a goal-oriented psychotherapy counseling approach that aims to instill positive change in a client through discussion. Since the discourse is in the form of semi-structured natural conversation, it often involves a variety of non-verbal social and affective behaviors such as laughter. Laughter carries information related to affect, mood and personality and can offer a w...

Publication date: 2012